Dataset statistics
| Number of variables | 7 |
|---|---|
| Number of observations | 32797576 |
| Missing cells | 98392728 |
| Missing cells (%) | 42.9% |
| Duplicate rows | 235397 |
| Duplicate rows (%) | 0.7% |
| Total size in memory | 2.0 GiB |
| Average record size in memory | 65.0 B |
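
The percentages in this table follow from the raw counts. As a minimal sketch of the arithmetic, using only figures reported in this document (the variable names are illustrative), the per-column missing counts also add up to the reported number of missing cells:

```python
# Values copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 32_797_576
missing_cells = 98_392_728
duplicate_rows = 235_397

# Per-column missing counts as reported in each variable's section.
missing_per_column = {
    "sensor_id": 0,
    "time": 5_160,
    "charge": 5_160,
    "auxiliary": 5_160,
    "x": 32_792_416,
    "y": 32_792_416,
    "z": 32_792_416,
}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
column_sum = sum(missing_per_column.values())

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Per-column counts match the total: {column_sum == missing_cells}")
```
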
Variable types
| Numeric | 6 |
|---|---|
| Boolean | 1 |
Alerts
| Dataset has 235397 (0.7%) duplicate rows | Duplicates |
|---|---|
| x has 32792416 (> 99.9%) missing values | Missing |
| y has 32792416 (> 99.9%) missing values | Missing |
| z has 32792416 (> 99.9%) missing values | Missing |
Reproduction
| Analysis started | 2023-06-13 14:34:39.250075 |
|---|---|
| Analysis finished | 2023-06-13 14:49:15.258071 |
| Duration | 14 minutes and 36.01 seconds |
| Software version | pandas-profiling v3.6.6 |
| Download configuration | config.json |
sensor_id
Real number (ℝ)
| Distinct | 5160 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2713.0239 |
| Minimum | 0 |
| Maximum | 5159 |
| Zeros | 4067 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 500.5 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 286 |
| Q1 | 1366 |
| median | 2741 |
| Q3 | 4096 |
| 95-th percentile | 5003 |
| Maximum | 5159 |
| Range | 5159 |
| Interquartile range (IQR) | 2730 |
Descriptive statistics
| Standard deviation | 1543.4085 |
|---|---|
| Coefficient of variation (CV) | 0.56888863 |
| Kurtosis | -1.2531026 |
| Mean | 2713.0239 |
| Median Absolute Deviation (MAD) | 1369 |
| Skewness | -0.054415364 |
| Sum | 8.8980609 × 10¹⁰ |
| Variance | 2382109.8 |
| Monotonicity | Not monotonic |
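
Several of the derived statistics above can be recomputed from the other reported values. A minimal sketch for `sensor_id`, assuming the usual definitions (CV as standard deviation divided by mean, variance as squared standard deviation, IQR as Q3 minus Q1); the variable names are illustrative:

```python
# Values copied from the sensor_id tables above.
mean = 2713.0239
std = 1543.4085
q1, q3 = 1366, 4096
minimum, maximum = 0, 5159

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV = {cv:.4f}")
print(f"variance = {variance:.1f}")
print(f"IQR = {iqr}, range = {value_range}")
```

Small differences in the last digits are expected because the inputs are themselves rounded.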
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 5038 | 14561 | < 0.1% |
| 4979 | 14468 | < 0.1% |
| 4978 | 14321 | < 0.1% |
| 5037 | 14294 | < 0.1% |
| 4915 | 14134 | < 0.1% |
| 4918 | 14133 | < 0.1% |
| 4913 | 14003 | < 0.1% |
| 4976 | 13943 | < 0.1% |
| 5033 | 13942 | < 0.1% |
| 4858 | 13892 | < 0.1% |
| Other values (5150) | 32655885 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 4067 | |
| 1 | 4747 | |
| 2 | 4779 | |
| 3 | 4815 | |
| 4 | 4241 | |
| 5 | 3615 | |
| 6 | 3830 | |
| 7 | 3914 | |
| 8 | 4328 | |
| 9 | 5165 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5159 | 13448 | |
| 5158 | 12522 | |
| 5157 | 12819 | |
| 5156 | 13281 | |
| 5155 | 13362 | |
| 5154 | 12683 | |
| 5153 | 12249 | |
| 5152 | 12394 | |
| 5151 | 11646 | |
| 5150 | 12061 |
time
Real number (ℝ)
| Distinct | 52433 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 5160 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13130.478 |
| Minimum | 5714 |
| Maximum | 77785 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 500.5 MiB |
Quantile statistics
| Minimum | 5714 |
|---|---|
| 5-th percentile | 8765 |
| Q1 | 10566 |
| median | 11815 |
| Q3 | 13916 |
| 95-th percentile | 21428 |
| Maximum | 77785 |
| Range | 72071 |
| Interquartile range (IQR) | 3350 |
Descriptive statistics
| Standard deviation | 4876.7966 |
|---|---|
| Coefficient of variation (CV) | 0.37141043 |
| Kurtosis | 15.177769 |
| Mean | 13130.478 |
| Median Absolute Deviation (MAD) | 1475 |
| Skewness | 3.2034347 |
| Sum | 4.305801 × 10¹¹ |
| Variance | 23783145 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 9885 | 11999 | < 0.1% |
| 9882 | 11974 | < 0.1% |
| 9887 | 11956 | < 0.1% |
| 9884 | 11944 | < 0.1% |
| 9888 | 11907 | < 0.1% |
| 9890 | 11743 | < 0.1% |
| 9883 | 11691 | < 0.1% |
| 9886 | 11682 | < 0.1% |
| 9880 | 11641 | < 0.1% |
| 9889 | 11625 | < 0.1% |
| Other values (52423) | 32674254 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5714 | 4 | < 0.1% |
| 5715 | 8 | < 0.1% |
| 5716 | 21 | |
| 5717 | 27 | |
| 5718 | 26 | |
| 5719 | 30 | |
| 5720 | 35 | |
| 5721 | 48 | |
| 5722 | 45 | |
| 5723 | 43 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 77785 | 1 | |
| 76736 | 1 | |
| 76151 | 1 | |
| 75889 | 1 | |
| 75814 | 1 | |
| 75550 | 1 | |
| 75208 | 1 | |
| 75148 | 1 | |
| 75013 | 1 | |
| 75006 | 1 |
charge
Real number (ℝ)
| Distinct | 8661 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 5160 |
| Missing (%) | < 0.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9089811 |
| Minimum | 0.025 |
| Maximum | 2762.0249 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 500.5 MiB |
Quantile statistics
| Minimum | 0.025 |
|---|---|
| 5-th percentile | 0.375 |
| Q1 | 0.77499998 |
| median | 1.075 |
| Q3 | 1.775 |
| 95-th percentile | 12.125 |
| Maximum | 2762.0249 |
| Range | 2761.9999 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 16.288969 |
|---|---|
| Coefficient of variation (CV) | 4.1670627 |
| Kurtosis | 846.18837 |
| Mean | 3.9089811 |
| Median Absolute Deviation (MAD) | 0.39999998 |
| Skewness | 16.464331 |
| Sum | 1.2818493 × 10⁸ |
| Variance | 265.33052 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.9750000238 | 1359439 | 4.1% |
| 0.9250000119 | 1353154 | 4.1% |
| 1.024999976 | 1326740 | 4.0% |
| 0.875 | 1319422 | 4.0% |
| 1.075000048 | 1265480 | 3.9% |
| 0.8249999881 | 1257551 | 3.8% |
| 1.125 | 1182114 | 3.6% |
| 0.7749999762 | 1161263 | 3.5% |
| 0.7250000238 | 1075470 | 3.3% |
| 1.174999952 | 1073846 | 3.3% |
| Other values (8651) | 20417937 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.02500000037 | 371 | < 0.1% |
| 0.07500000298 | 3957 | < 0.1% |
| 0.125 | 121103 | 0.4% |
| 0.174999997 | 181625 | 0.6% |
| 0.224999994 | 335813 | |
| 0.275000006 | 407549 | |
| 0.3249999881 | 427386 | |
| 0.375 | 465653 | |
| 0.4250000119 | 556332 | |
| 0.474999994 | 629413 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2762.024902 | 1 | |
| 2728.024902 | 1 | |
| 2600.425049 | 1 | |
| 2595.375 | 1 | |
| 2555.125 | 1 | |
| 2508.024902 | 1 | |
| 2504.074951 | 1 | |
| 2439.574951 | 1 | |
| 2410.375 | 1 | |
| 2401.675049 | 1 |
auxiliary
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 5160 |
| Missing (%) | < 0.1% |
| Memory size | 500.5 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| False | 23551893 | |
| True | 9240523 | 28.2% |
| (Missing) | 5160 | < 0.1% |
x
Real number (ℝ)
| Distinct | 118 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 32792416 |
| Missing (%) | > 99.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.8708295 |
| Minimum | -570.9 |
| Maximum | 576.37 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 2460 |
| Negative (%) | < 0.1% |
| Memory size | 500.5 MiB |
Quantile statistics
| Minimum | -570.9 |
|---|---|
| 5-th percentile | -447.74 |
| Q1 | -224.09 |
| median | 16.99 |
| Q3 | 224.58 |
| 95-th percentile | 472.05 |
| Maximum | 576.37 |
| Range | 1147.27 |
| Interquartile range (IQR) | 448.67 |
Descriptive statistics
| Standard deviation | 285.15121 |
|---|---|
| Coefficient of variation (CV) | 48.570856 |
| Kurtosis | -0.86208205 |
| Mean | 5.8708295 |
| Median Absolute Deviation (MAD) | 224.565 |
| Skewness | -0.0028952251 |
| Sum | 30293.48 |
| Variance | 81311.214 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -279.53 | 60 | < 0.1% |
| 11.87 | 60 | < 0.1% |
| -111.51 | 60 | < 0.1% |
| -234.95 | 60 | < 0.1% |
| -358.44 | 60 | < 0.1% |
| -481.6 | 60 | < 0.1% |
| 576.37 | 60 | < 0.1% |
| 472.05 | 60 | < 0.1% |
| 330.03 | 60 | < 0.1% |
| 195.03 | 60 | < 0.1% |
| Other values (108) | 4560 | < 0.1% |
| (Missing) | 32792416 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -570.9 | 60 | |
| -526.63 | 60 | |
| -492.43 | 60 | |
| -481.6 | 60 | |
| -447.74 | 60 | |
| -437.04 | 60 | |
| -413.46 | 60 | |
| -403.14 | 60 | |
| -392.38 | 60 | |
| -368.93 | 60 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 576.37 | 60 | |
| 544.07 | 60 | |
| 505.27 | 60 | |
| 500.43 | 60 | |
| 472.05 | 60 | |
| 444.05 | 1 | < 0.1% |
| 444 | 1 | < 0.1% |
| 443.96 | 5 | < 0.1% |
| 443.95 | 2 | < 0.1% |
| 443.94 | 1 | < 0.1% |
y
Real number (ℝ)
| Distinct | 117 |
|---|---|
| Distinct (%) | 2.3% |
| Missing | 32792416 |
| Missing (%) | > 99.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -2.5186085 |
| Minimum | -521.08 |
| Maximum | 509.5 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 2580 |
| Negative (%) | < 0.1% |
| Memory size | 500.5 MiB |
Quantile statistics
| Minimum | -521.08 |
|---|---|
| 5-th percentile | -442.42 |
| Q1 | -209.07 |
| median | -6.055 |
| Q3 | 211.66 |
| 95-th percentile | 451.52 |
| Maximum | 509.5 |
| Range | 1030.58 |
| Interquartile range (IQR) | 420.73 |
Descriptive statistics
| Standard deviation | 269.40973 |
|---|---|
| Coefficient of variation (CV) | -106.96769 |
| Kurtosis | -0.8709524 |
| Mean | -2.5186085 |
| Median Absolute Deviation (MAD) | 203.945 |
| Skewness | 0.015403941 |
| Sum | -12996.02 |
| Variance | 72581.602 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 23.17 | 60 | < 0.1% |
| 179.19 | 60 | < 0.1% |
| 159.98 | 60 | < 0.1% |
| 140.44 | 60 | < 0.1% |
| 120.56 | 60 | < 0.1% |
| 101.39 | 60 | < 0.1% |
| 170.92 | 60 | < 0.1% |
| 127.9 | 60 | < 0.1% |
| 127.2 | 60 | < 0.1% |
| 125.59 | 60 | < 0.1% |
| Other values (107) | 4560 | < 0.1% |
| (Missing) | 32792416 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -521.08 | 60 | |
| -501.45 | 60 | |
| -481.74 | 60 | |
| -461.99 | 60 | |
| -442.42 | 60 | |
| -424.5 | 60 | |
| -422.83 | 60 | |
| -404.48 | 60 | |
| -384.3 | 60 | |
| -364.83 | 60 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 509.5 | 60 | |
| 490.22 | 60 | |
| 470.86 | 60 | |
| 463.72 | 60 | |
| 451.52 | 60 | |
| 432.35 | 60 | |
| 412.79 | 60 | |
| 393.24 | 60 | |
| 374.24 | 60 | |
| 354.24 | 60 |
z
Real number (ℝ)
| Distinct | 4975 |
|---|---|
| Distinct (%) | 96.4% |
| Missing | 32792416 |
| Missing (%) | > 99.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -23.905766 |
| Minimum | -512.82 |
| Maximum | 524.56 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 2736 |
| Negative (%) | < 0.1% |
| Memory size | 500.5 MiB |
Quantile statistics
| Minimum | -512.82 |
|---|---|
| 5-th percentile | -467.4815 |
| Q1 | -283.2 |
| median | -35.115 |
| Q3 | 228.5575 |
| 95-th percentile | 451.284 |
| Maximum | 524.56 |
| Range | 1037.38 |
| Interquartile range (IQR) | 511.7575 |
Descriptive statistics
| Standard deviation | 296.45656 |
|---|---|
| Coefficient of variation (CV) | -12.401049 |
| Kurtosis | -1.2180985 |
| Mean | -23.905766 |
| Median Absolute Deviation (MAD) | 257.6 |
| Skewness | 0.10163346 |
| Sum | -123353.75 |
| Variance | 87886.493 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 41.92 | 2 | < 0.1% |
| -369.39 | 2 | < 0.1% |
| 484.44 | 2 | < 0.1% |
| -176.41 | 2 | < 0.1% |
| 450.4 | 2 | < 0.1% |
| 433.38 | 2 | < 0.1% |
| 416.36 | 2 | < 0.1% |
| 399.34 | 2 | < 0.1% |
| 382.32 | 2 | < 0.1% |
| 365.3 | 2 | < 0.1% |
| Other values (4965) | 5140 | < 0.1% |
| (Missing) | 32792416 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -512.82 | 1 | |
| -510.57 | 1 | |
| -510.18 | 1 | |
| -509.09 | 1 | |
| -508.41 | 1 | |
| -507.4 | 1 | |
| -507.28 | 1 | |
| -507.16 | 1 | |
| -506.97 | 1 | |
| -506.62 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 524.56 | 1 | |
| 523.42 | 1 | |
| 516.67 | 1 | |
| 512.95 | 1 | |
| 512.74 | 1 | |
| 507.53 | 1 | |
| 506.4 | 1 | |
| 506.23 | 1 | |
| 506.14 | 1 | |
| 505.72 | 1 |
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
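
One way to picture nullity correlation is as the Pearson correlation between per-column "is missing" indicators. The sketch below is an illustration under that assumption, not the profiling library's implementation; the `pearson` helper and the five-row sample are hypothetical, shaped like the rows in this report where `time` and `charge` are present exactly when `x` is absent.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical mini-sample; None marks a missing cell.
rows = [
    {"time": 5928.0, "charge": 1.325, "x": None},
    {"time": 6115.0, "charge": 1.175, "x": None},
    {"time": 6492.0, "charge": 0.925, "x": None},
    {"time": None, "charge": None, "x": -10.97},
    {"time": None, "charge": None, "x": -10.97},
]

# 1 where the value is missing, 0 where it is present.
is_null = {col: [int(r[col] is None) for r in rows] for col in rows[0]}

time_vs_charge = pearson(is_null["time"], is_null["charge"])
time_vs_x = pearson(is_null["time"], is_null["x"])
print(f"time vs charge: {time_vs_charge:+.2f}")
print(f"time vs x:      {time_vs_x:+.2f}")
```

A value near +1 means two columns tend to be missing together; a value near -1 means one is present when the other is missing.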
First rows
| | sensor_id | time | charge | auxiliary | x | y | z |
|---|---|---|---|---|---|---|---|
| 24 | 3918 | 5928.0 | 1.325 | True | NaN | NaN | NaN |
| 24 | 4157 | 6115.0 | 1.175 | True | NaN | NaN | NaN |
| 24 | 3520 | 6492.0 | 0.925 | True | NaN | NaN | NaN |
| 24 | 5041 | 6665.0 | 0.225 | True | NaN | NaN | NaN |
| 24 | 2948 | 8054.0 | 1.575 | True | NaN | NaN | NaN |
| 24 | 860 | 8124.0 | 0.675 | True | NaN | NaN | NaN |
| 24 | 2440 | 8284.0 | 1.625 | True | NaN | NaN | NaN |
| 24 | 1743 | 8478.0 | 0.775 | True | NaN | NaN | NaN |
| 24 | 3609 | 8572.0 | 1.025 | True | NaN | NaN | NaN |
| 24 | 5057 | 8680.0 | 3.975 | True | NaN | NaN | NaN |
Last rows
| | sensor_id | time | charge | auxiliary | x | y | z |
|---|---|---|---|---|---|---|---|
| 5150 | 5150 | NaN | NaN | NaN | -10.97 | 6.72 | -437.34 |
| 5151 | 5151 | NaN | NaN | NaN | -10.97 | 6.72 | -444.35 |
| 5152 | 5152 | NaN | NaN | NaN | -10.97 | 6.72 | -451.36 |
| 5153 | 5153 | NaN | NaN | NaN | -10.97 | 6.72 | -458.37 |
| 5154 | 5154 | NaN | NaN | NaN | -10.97 | 6.72 | -465.38 |
| 5155 | 5155 | NaN | NaN | NaN | -10.97 | 6.72 | -472.39 |
| 5156 | 5156 | NaN | NaN | NaN | -10.97 | 6.72 | -479.39 |
| 5157 | 5157 | NaN | NaN | NaN | -10.97 | 6.72 | -486.40 |
| 5158 | 5158 | NaN | NaN | NaN | -10.97 | 6.72 | -493.41 |
| 5159 | 5159 | NaN | NaN | NaN | -10.97 | 6.72 | -500.73 |
Most frequently occurring
| | sensor_id | time | charge | auxiliary | x | y | z | # duplicates |
|---|---|---|---|---|---|---|---|---|
| 33854 | 779 | 9880.0 | 0.975 | False | NaN | NaN | NaN | 5 |
| 114297 | 2702 | 9884.0 | 1.075 | False | NaN | NaN | NaN | 5 |
| 126583 | 3000 | 9897.0 | 0.875 | False | NaN | NaN | NaN | 5 |
| 161188 | 3840 | 9883.0 | 0.925 | False | NaN | NaN | NaN | 5 |
| 166480 | 3966 | 9969.0 | 0.925 | False | NaN | NaN | NaN | 5 |
| 3583 | 72 | 9863.0 | 0.725 | False | NaN | NaN | NaN | 4 |
| 7175 | 165 | 12213.0 | 0.875 | False | NaN | NaN | NaN | 4 |
| 8207 | 180 | 9887.0 | 0.575 | False | NaN | NaN | NaN | 4 |
| 8811 | 193 | 9881.0 | 1.075 | False | NaN | NaN | NaN | 4 |
| 13370 | 301 | 9853.0 | 0.325 | False | NaN | NaN | NaN | 4 |
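
A duplicate count like the one in this table can be obtained by tallying identical rows. A minimal sketch, assuming rows are represented as tuples; the rows are modeled on the table above but the repetition counts are made up for illustration, and missing cells are written as `None` because `NaN` does not compare equal to itself:

```python
from collections import Counter

# Hypothetical rows: (sensor_id, time, charge, auxiliary, x, y, z).
rows = [
    (779, 9880.0, 0.975, False, None, None, None),
    (779, 9880.0, 0.975, False, None, None, None),
    (72, 9863.0, 0.725, False, None, None, None),
    (779, 9880.0, 0.975, False, None, None, None),
    (72, 9863.0, 0.725, False, None, None, None),
    (165, 12213.0, 0.875, False, None, None, None),
]

counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)
```
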